Monte Carlo calculations allow us to use random numbers to find approximate solutions to mathematical problems that may otherwise be too complicated to solve
There are many different ways in which this can help
The fundamental issue with all Monte Carlo analysis is that the results are only approximate
The Law of Large Numbers states that if a large number of random events occur then the average outcome will converge on the expected outcome
It is also known as the law of averages
Sometimes it is invoked when too few trials have been used - this is known as the Gambler's fallacy
If 1000 random numbers are drawn from the distribution $N(48,20^2)$, what is the probability that the average lies between 47 and 49?
The sum of the 1000 numbers is $\sim N(48000, 1000 \times 20^2)$
So $P(47 < Average < 49) = P(47000 < Sum < 49000)$
= $P(47000 < N(48000, 1000 \times 20^2) < 49000)$
= $1 - 2 \times P(N(48000, 1000 \times 20^2) < 47000)$
= $1- 2 \times P \left(N(0,1) < \frac{47000-48000}{\sqrt{400,000}}\right)$
= $1- 2 \times P \left(N(0,1) < -1.58114 \right) = 0.8862$
If random numbers are drawn from the distribution $U(0,1)$, how many have to be drawn before the average lies between 0.4999 and 0.5001 with a probability of 95%?
Clearly $E(U(0, 1)) = 0.5$, but what is $Var(U(0,1))$?
We use $Var(X) = E(X^2) - (E(X))^2$
$Var(X) = \displaystyle\int_0^1 x^2 dx - (0.5)^2$
$Var(X) = \left[\frac{x^3}{3}\right]_0^1 - \frac{1}{4}$
$Var(X) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$
Then we need to assume that the central limit theorem will apply and so the sum of n trials will be:
$Sum \sim N\left(\frac{n}{2}, \frac{n}{12} \right)$
The average of n trials will be
$Average \sim N\left(\frac{1}{2}, \frac{1}{12n} \right)$
By symmetry we need to solve:
$P \left(N\left(\frac{1}{2}, \frac{1}{12n} \right) < 0.4999 \right) = 0.025$
$=P \left(N\left(0, 1 \right) < \frac{0.4999 - \frac{1}{2}}{\sqrt{\frac{1}{12n}}} \right) = 0.025$
$\therefore \frac{0.4999 - \frac{1}{2}}{\sqrt{\frac{1}{12n}}} = -1.96$
$\therefore 0.0001 = 1.96 \times \sqrt{\frac{1}{12n}}$
$\therefore n=\frac{1}{12} \left(\frac{1.96}{0.0001} \right)^2 = 32,013,333$
How many trials would we need for the average to be between 0.49 and 0.51 with a probability of 95%?
$\therefore n=\frac{1}{12} \left(\frac{1.96}{0.01} \right)^2 = 3,201$
If we do 10,000 trials what accuracy would we expect?
Solve $\frac{1}{12} \left(\frac{1.96}{A} \right)^2 = 10,000$ for the accuracy $A$
$A = \frac{1.96}{\sqrt{120,000}} = 0.005658$
Therefore the 95% confidence interval is $(0.4943, 0.5057)$
In Excel we use the function RAND() to generate a random number from $U(0,1)$
In VBA we use the function Rnd() to generate a random number from $U(0,1)$
Test the above calculation by finding the average of 3,201 random numbers generated from $U(0,1)$ and test how many times it falls between 0.49 and 0.51.
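A minimal VBA sketch of such a test might look as follows (the routine name and the number of repetitions are just choices made here; it reports how often the average of 3,201 draws lands in (0.49, 0.51), which should be roughly 95% of the time):

Sub TestAverageOfUniforms()
    ' Repeat the experiment and count how often the average of
    ' 3,201 U(0,1) draws falls in (0.49, 0.51)
    Dim rep As Long, i As Long, hits As Long
    Dim total As Double, avg As Double
    Const nDraws As Long = 3201
    Const nReps As Long = 1000
    Randomize
    For rep = 1 To nReps
        total = 0
        For i = 1 To nDraws
            total = total + Rnd
        Next i
        avg = total / nDraws
        If avg > 0.49 And avg < 0.51 Then hits = hits + 1
    Next rep
    MsgBox "Proportion of averages in (0.49, 0.51): " & hits / nReps
End Sub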
Spreadsheet is here
We use the inverse c.d.f. method to generate random variables from other probability distributions, given that we can start from the uniform distribution
The steps are: draw $u$ from $U(0,1)$, then set $x = F^{-1}(u)$, where $F$ is the c.d.f. of the target distribution
The diagrams below illustrate this:
Write a function which generates numbers randomly from a Weibull distribution
Remember the p.d.f. is $f(x)=\frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^k}$ for $x \ge 0$, and 0 elsewhere
Hint: Use the substitution $u=e^{-(x/\lambda)^k}$ to calculate $F(x)$
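As a sketch of one possible answer, using the inverse c.d.f. from the hint, $F^{-1}(u)=\lambda(-ln(1-u))^{1/k}$ (the function name and argument order below are just illustrative choices):

Function RandWeibull(ByVal k As Double, ByVal lambda As Double) As Double
    ' Inverse c.d.f. method: F(x) = 1 - exp(-(x/lambda)^k),
    ' so x = lambda * (-ln(1 - u))^(1/k) for u ~ U(0,1)
    Dim u As Double
    u = Rnd
    RandWeibull = lambda * (-Log(1 - u)) ^ (1 / k)
End Function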
This spreadsheet shows the method and compares the results with the theoretical pdf of the Weibull distribution
Can we apply the inverse c.d.f. method to the normal distribution?
No - because the p.d.f. cannot be integrated in closed form, so there is no explicit formula for the c.d.f. or its inverse
Fortunately there is an algorithm which we can use to produce randomly distributed normal variables
This is called the Box-Muller algorithm and it produces pairs of independent random variables from the standard normal distribution: $N(0,1)$
Let $U_1$ and $U_2$ be r.v.s from the uniform distribution $U(0,1)$ then if
$Z_1 = \sqrt{-2 ln U_1} cos (2 \pi U_2)$ and
$Z_2 = \sqrt{-2 ln U_1} sin (2 \pi U_2)$
then $Z_1$ and $Z_2$ are independent random variables both with a distribution of $N(0,1)$
Write a VBA routine to produce normal random variables using the Box-Muller algorithm
A spreadsheet can be found here
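One possible VBA sketch is given below (the function name RandNormal is just a choice made here; it caches the second member of each Box-Muller pair so that a single $N(0,1)$ variable is returned per call):

Function RandNormal() As Double
    ' Box-Muller: generate a pair of independent N(0,1) variables,
    ' return one now and keep the other for the next call
    Static haveSpare As Boolean
    Static spare As Double
    Dim u1 As Double, u2 As Double, r As Double
    Const PI As Double = 3.14159265358979
    If haveSpare Then
        haveSpare = False
        RandNormal = spare
        Exit Function
    End If
    Do
        u1 = Rnd
    Loop While u1 = 0          ' guard against Log(0)
    u2 = Rnd
    r = Sqr(-2 * Log(u1))
    RandNormal = r * Cos(2 * PI * u2)
    spare = r * Sin(2 * PI * u2)
    haveSpare = True
End Function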
Is there a short cut we can use in Excel?
Yes - we can use NORMSINV(RAND())
Is this as good as the Box-Muller approach?
No - the NORMSINV function relies on a numerical approximation to the inverse c.d.f. (albeit a very precise one)
How could we generate random numbers from the distribution $N(\mu, \sigma^2)$?
Generate r.v. from $X \sim N(0,1)$ and then use $Y=\sigma \times X + \mu$
How would we generate lognormal r.v.s such as $Y \sim ln N(\mu, \sigma^2)$?
First generate $X \sim N(0,1)$ then $Y= e^{\sigma \times X + \mu}$
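Building on the RandNormal sketch above (both wrapper names below are just illustrative choices), these two transformations might be coded as:

Function RandNormalGeneral(ByVal mu As Double, ByVal sigma As Double) As Double
    ' Scale and shift a standard normal: Y = sigma * X + mu
    RandNormalGeneral = sigma * RandNormal() + mu
End Function

Function RandLogNormal(ByVal mu As Double, ByVal sigma As Double) As Double
    ' Exponentiate a N(mu, sigma^2) variable
    RandLogNormal = Exp(sigma * RandNormal() + mu)
End Function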
The proof is motivated by thinking about polar coordinates as random variables and then mapping onto Cartesian coordinates.
We set up $R$ and $\Phi$ as random variables and note that:
$X=R cos \Phi$ and $Y=R sin \Phi$
we define $R=\sqrt{-2 ln (U_1)}$
so $P(R \le r)=P(-2 ln(U_1) \le r^2)$
$=P(ln(U_1) \ge -\frac{r^2}{2})$
$=1-P \left(U_1 \lt exp \left( - \frac{r^2}{2} \right) \right)$
As $U_1$ is uniform on $(0,1)$ we can say
$P(R \le r) = 1 - \displaystyle\int_0^{exp(-r^2/2)} dt = 1 - exp \left(-\frac{r^2}{2} \right)$
So we can differentiate the cdf to get the pdf and we see
$f_R(r)=exp\left(-\frac{r^2}{2} \right) \times r$ for $r \gt 0$
We define a new random variable $\Phi$ to be $\Phi=2 \pi U_2$
and we can see that: $f_{\Phi}(\phi)=\frac{1}{2 \pi}$ for $0 \lt \phi \le 2 \pi$
Given $U_1$ and $U_2$ are independent, $R$ and $\Phi$ must be independent too
so we can multiply the p.d.f.s to get the joint density function:
$f_{R,\Phi}(r,\phi)=f_R(r) f_{\Phi}(\phi)=\frac{1}{2 \pi} exp \left( - \frac{r^2}{2} \right) . r$
so the probability that an event lies in a little bit of the $R - \Phi$ plane is:
$P=\frac{1}{2 \pi} exp \left( - \frac{r^2}{2} \right) . r .dr.d\phi$
But we are interested in random variables over the $X-Y$ plane not the $R-\Phi$ plane
So we need to equate probabilities for a little bit of the area of the $R-\Phi$ plane and of the $X-Y$ plane
i.e. $\frac{1}{2 \pi} exp \left( - \frac{r^2}{2} \right) . r .dr.d\phi = f_{X,Y}(x,y) dx dy$
Geometrically we can see that $dr d\phi$ describes an area of length $dr$ and width $r \times d\phi$, that is an area of $r \times dr d\phi$, whereas $dx dy$ simply describes an area of $dx \times dy$
So to transform from the $R-\Phi$ plane to the $X-Y$ plane we simply divide the pdf by $r$ and we have:
$f_{X,Y}(x,y) = \frac{1}{r} \times \frac{1}{2 \pi} exp \left( - \frac{r^2}{2} \right) . r$
$\therefore f_{X,Y}(x,y) = \frac{1}{2 \pi} exp \left( - \frac{r^2}{2} \right)$
$\therefore f_{X,Y}(x,y) = \frac{1}{2 \pi} exp \left( - \frac{x^2+y^2}{2} \right)$
$\therefore f_{X,Y}(x,y) = \sqrt{\frac{1}{2 \pi}} exp \left( - \frac{x^2}{2} \right). \sqrt{\frac{1}{2 \pi}} exp \left( - \frac{y^2}{2} \right)$
and so we have both $X$ and $Y$ are normally distributed and independent.
Where $X=R cos \Phi = \sqrt{-2 ln (U_1)} cos (2 \pi U_2)$
and $Y=R sin \Phi = \sqrt{-2 ln (U_1)} sin (2 \pi U_2)$
You will see in this proof echoes of the proof (page 1 and page 2) that the area under the normal distribution is 1
Computers cannot generate true random numbers as they only perform deterministic calculations.
If you really need true random numbers (and we generally do not) then this company takes atmospheric data to produce true random numbers
For our purposes we use pseudo random numbers which computers can generate.
The standard algorithm is called a linear congruential generator
These are defined by a recurrence relationship such as:
$X_{n+1} = (a X_n + c) \bmod m$
The choice of $a,m$ and $c$ is critical to the effective working of the method.
Let's build a spreadsheet and see how this works
What factors do you think would make a good choice of $a, m$ and $c$?
Try experimenting with different values until you have produced a good LCG
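As a starting point, a minimal VBA sketch of an LCG is given below; the parameters used ($a = 16807$, $c = 0$, $m = 2^{31}-1$, the well-known Lehmer 'minimal standard' choices) are only one reasonable option and you should experiment with your own:

Function MyLCG() As Double
    ' Linear congruential generator: X_{n+1} = (a*X_n + c) mod m,
    ' returned scaled to (0,1); arithmetic in Doubles avoids 32-bit overflow
    Static x As Double
    Const a As Double = 16807
    Const c As Double = 0
    Const m As Double = 2147483647      ' 2^31 - 1
    If x = 0 Then x = 12345             ' arbitrary non-zero seed
    x = a * x + c
    x = x - m * Int(x / m)
    MyLCG = x / m
End Function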
The random number generator in Excel is also a pseudo-random number generator, but we can use it as if it produced true random numbers
If you roll 4 dice what is the probability that the sum is 7?
Try to do this calculation with a traditional probability method
Now try to use a Monte Carlo method
Answers in attached spreadsheet
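For comparison, a minimal Monte Carlo sketch in VBA might look like the following (the exact answer, which the simulation should approach, is $20/1296 \approx 0.0154$):

Sub DiceSumSeven()
    ' Estimate P(sum of 4 dice = 7) by simulation
    Dim trial As Long, die As Long, total As Long, hits As Long
    Const nTrials As Long = 1000000
    Randomize
    For trial = 1 To nTrials
        total = 0
        For die = 1 To 4
            total = total + Int(6 * Rnd) + 1   ' uniform on 1..6
        Next die
        If total = 7 Then hits = hits + 1
    Next trial
    MsgBox "Estimated probability: " & hits / nTrials
End Sub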
The area of a circle is $A=\pi r^2$
Can you use this to develop a Monte Carlo algorithm to calculate $\pi$?
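One possible sketch: scatter points uniformly over the unit square, and use the fact that the proportion landing inside the quarter circle estimates $\pi / 4$:

Sub EstimatePi()
    ' The fraction of uniform points in the unit square that satisfy
    ' x^2 + y^2 <= 1 estimates pi/4
    Dim i As Long, hits As Long
    Dim x As Double, y As Double
    Const nTrials As Long = 1000000
    Randomize
    For i = 1 To nTrials
        x = Rnd
        y = Rnd
        If x * x + y * y <= 1 Then hits = hits + 1
    Next i
    MsgBox "Estimate of pi: " & 4 * hits / nTrials
End Sub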
When we price options we can use Monte Carlo methods but we can also use the Black-Scholes formula.
Why can two such different methods do the same thing?
Can you use a Monte Carlo method to find:
a) $\displaystyle\int_0^{\pi/2} sin (x) dx$
b) $\displaystyle\int_0^1 e^{-x^2/2} dx$
Is this a sensible method to use for these integrations?
a) No - this can be solved analytically
b) No - a quadrature technique should be used here
What about functions which cannot be integrated analytically and would be unstable if quadrature techniques were applied?
For example, how about the calculation:
$\displaystyle\int_0^1 sin \left( \frac{1}{x} \right) dx$
Here you will find that a Monte Carlo technique gives a robust, if approximate, way of calculating the integral
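A minimal sketch of such a calculation: because the integrand is bounded, the plain average of $sin(1/u)$ over uniform draws converges to the value of the integral:

Sub IntegrateSinReciprocal()
    ' Monte Carlo estimate of the integral of sin(1/x) over (0,1):
    ' average sin(1/u) over uniform draws u
    Dim i As Long
    Dim u As Double, total As Double
    Const nTrials As Long = 1000000
    Randomize
    For i = 1 To nTrials
        Do
            u = Rnd
        Loop While u = 0        ' avoid division by zero
        total = total + Sin(1 / u)
    Next i
    MsgBox "Estimate of the integral: " & total / nTrials
End Sub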
Checking for randomness is just hypothesis testing where the null hypothesis is:
$H_0$: The numbers are random
$H_1$: They are not random
So to implement such a hypothesis test we need to think of features we would expect to see in truly random numbers
Do we need to check for randomness of normal variables if we use Box-Muller, or of other distributions if we use the inverse c.d.f. method?
No - these all begin from $U(0,1)$, so if we have properly verified randomness of the $U(0,1)$ generator then randomness of the other distributions will follow
On average we would expect the proportion of random numbers $X \sim U(0,1)$ that are less than $c$ to be $c$
We need to develop this into a more formal statistical test, so we note that the number of random numbers less than $c$ will be binomially distributed
$X \sim Bin(N,c)$ where $N$ is the number of random numbers drawn
Given we are going to test with very large values of $N$ we can assume a normal distribution, and so
$X \sim N(Nc, Nc(1-c))$
Suppose we run 10,000 simulations and we find that 5,200 of the random numbers are less than 0.5. Do we accept or reject the null hypothesis that our numbers are random?
We expect $X \sim N(5000, 2500)$ and so our 95% confidence interval is $(5000 - 1.96 \times \sqrt{2500}, 5000 + 1.96 \times \sqrt{2500}) = (4902, 5098)$; since 5,200 lies outside this interval we reject $H_0$
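A minimal VBA sketch of this counting test (it uses the built-in Rnd; substitute your own generator to test that instead):

Sub CountBelowHalf()
    ' Count how many of 10,000 draws are below 0.5 and compare
    ' with the 95% interval (4902, 5098) derived above
    Dim i As Long, countBelow As Long
    Const nTrials As Long = 10000
    Randomize
    For i = 1 To nTrials
        If Rnd < 0.5 Then countBelow = countBelow + 1
    Next i
    MsgBox "Count below 0.5: " & countBelow & _
           " (reject H0 if outside 4902 to 5098)"
End Sub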
How many numbers fall within a given set
We can easily extend the above test to any set on the interval $U(0,1)$
Why is this important?
A sequence like 0.4, 0.6, 0.4, 0.6, 0.4, 0.6 will fool the above test
Suppose we run 1,000,000 simulations from $U(0,1)$ and 803,000 of them fall between 0.1 and 0.9
The number falling between 0.1 and 0.9 has a binomial distribution
$X \sim Bin(1000000, 0.8) \approx N(1000000 \times 0.8, 1000000 \times 0.8 \times 0.2) = N(800000, 400^2)$
So again our 95% confidence interval is
$(800,000 - 1.96 \times 400, 800,000 + 1.96 \times 400) = (799216, 800784)$.
and we can reject $H_0$
Why might this test be particularly useful?
We might be most concerned with the accuracy of our tail distribution analysis
We could also test how many random numbers are in the set $[0,0.1) \cup [0.2,0.3) \cup [0.4, 0.5) \cup [0.6, 0.7) \cup [0.8, 0.9)$
That is: do 50% of the random numbers have a first decimal place that is even?
We could easily extend this to counting how many random numbers have a third decimal place which is even
How would you test for the third decimal place being even?
If (x * 500 - Int(x * 500)) < 0.5 Then
    ' the third decimal place of x is even (the integer part of 1000x is even exactly when the fractional part of 500x is below 0.5)
End If
We should not under-estimate the power of the human eye to detect problems with our random numbers
Many patterns that are difficult to detect by statistical methods will be instantly visible to the human eye
The method is very simple: generate a large sample of numbers, plot them (for example as a histogram), and look for anything that does not appear random
In purely statistical terms this is a mere extension of the above and does not really add anything, but any bias or pattern will be clearly visible
We cannot use this type of test to measure the extent of the randomness, but only to detect any obviously non-random patterns
Design an LCG: choose $m,a,c$ and $X_0$
Plot a histogram of the results to see if any obvious non-random features are visible
We can extend the above to look at the distribution of the frequencies with which numbers fall within each interval.
This is useful because a sequence which simply counted its way evenly across the interval $(0,1)$ in sufficiently small steps would fool the above tests
We need an array to count the number of times we fall in each interval.
Suppose we subdivide the interval $(0,1)$ into 100 intervals $\left(\frac{n-1}{100}, \frac{n}{100}\right)$ for $n = 1, \dots, 100$
We then need to group our data into a frequency array
We finally need to compare this frequency array with the normal distribution we would expect to get if true randomness were being observed
This is slightly confusing as there is effectively a double frequency count going on. Let's look at an example, this time using 1,000 intervals of width 0.001
If we generate 1,000,000 (1m) random numbers then we expect the number that fall in the interval $(0.230, 0.231)$ to be $Bin(1000000, 0.001)$, or approximately $N(1000, 999)$
In fact this is the same distribution for all the intervals
So we have 1000 numbers all from $ N(1000, 999)$
We then group these into a histogram and compare with the normal distribution
Design an LCG: choose $m,a,c$ and $X_0$
Perform the above statistical test, which should result in a spreadsheet with your histogram with the normal distribution overlaid on top
You could test whether individual frequencies in your histogram fell within an appropriate confidence interval
You could check that the proportion of frequencies that fell within a 95% confidence interval was about 95%
You could do a chi-squared test on the actual versus expected from the histogram
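As a sketch of the last of these options, a simple first-level chi-squared test of the 100 interval counts against their expected values might look like this (it uses the built-in Rnd; substitute your own LCG to test it, and note that the full histogram-of-frequencies comparison described above is a natural extension). The resulting statistic can be compared with the 95% critical value of a chi-squared distribution with 99 degrees of freedom, roughly 123 (CHIINV(0.05, 99) in Excel):

Sub ChiSquaredFrequencyTest()
    ' Count how many draws fall in each of 100 equal sub-intervals of (0,1),
    ' then compute the chi-squared statistic against the expected count
    ' of nTrials/100 in each interval (99 degrees of freedom)
    Dim counts(1 To 100) As Long
    Dim i As Long, bin As Long
    Dim expected As Double, chiSq As Double
    Const nTrials As Long = 1000000
    Randomize
    For i = 1 To nTrials
        bin = Int(100 * Rnd) + 1            ' interval index 1..100
        counts(bin) = counts(bin) + 1
    Next i
    expected = nTrials / 100
    For bin = 1 To 100
        chiSq = chiSq + (counts(bin) - expected) ^ 2 / expected
    Next bin
    MsgBox "Chi-squared statistic (99 d.o.f.): " & chiSq
End Sub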
For truly random numbers each number will be completely independent of the previous one
Thinking back to the definition of independence
$f_{X,Y}(x,y) = f_X(x) \times f_Y(y) \space \forall x, y$
We also need independence of all preceding random numbers not just the previous one so our definition must extend to:
$f_{X_1,X_2,...,X_n}(x_1,x_2,..,x_n) = f_{X_1}(x_1) \times f_{X_2}(x_2) \times ... \times f_{X_n}(x_n) \space$
$\forall x_1, x_2, ..., x_n$
Really this is just another way of saying that whatever the values of $x_1, x_2, ..., x_{n-1}$ are, the distribution of $x_n$ should be unaffected
We can test for this by selecting subsets of our random numbers $x_n$ such that the preceding numbers $x_1, x_2, ..., x_{n-1}$ have some given property, and then performing all the same randomness tests on that subset of $x_n$
Suppose there is a strong serial correlation between successive values in our random number generator.
Then values of $x_n \gt 0.5$ would be more likely when the previous value satisfied $x_{n-1} \gt 0.5$
Selecting only those values of $x_n$ for which $x_{n-1} \gt 0.5$, and then testing whether the number of these that exceed 0.5 is consistent with $Bin(n, 0.5)$ (where $n$ is the size of the subset), would cause a failure of $H_0$: independence
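A minimal sketch of this kind of conditional test (again using the built-in Rnd; substitute the generator you actually want to test):

Sub IndependenceSubsetTest()
    ' Keep only those draws whose predecessor exceeded 0.5 and check
    ' whether roughly half of that subset also exceeds 0.5
    Dim i As Long, subsetSize As Long, subsetAboveHalf As Long
    Dim prev As Double, x As Double
    Const nTrials As Long = 1000000
    Randomize
    prev = Rnd
    For i = 1 To nTrials
        x = Rnd
        If prev > 0.5 Then
            subsetSize = subsetSize + 1
            If x > 0.5 Then subsetAboveHalf = subsetAboveHalf + 1
        End If
        prev = x
    Next i
    MsgBox "Proportion above 0.5 within the subset: " & subsetAboveHalf / subsetSize
End Sub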
You are concerned that the random number generator tends to pick a number less than the previous one if each of the previous two numbers was higher than the one before it
In the middle of your test algorithm:
Do
    n = n + 1
    new_num(n) = my_LCG()
Loop Until (new_num(n - 1) > new_num(n - 2)) And (new_num(n - 2) > new_num(n - 3))
new_num(n) is now in your subset of random numbers on which to perform tests
If $x_{n-1} \gt x_{n-2}$, what is $P(x_n > x_{n-1})$?
One third - the three values are i.i.d., so all six orderings are equally likely; three of them have $x_{n-1} > x_{n-2}$ and only one of those also has $x_n > x_{n-1}$
The big problem with Monte Carlo is that it is always an approximation
The solution is to do lots of runs
The problem with lots of runs is that they take time
However we can increase the rate of convergence using certain techniques which we call variance reduction techniques
Suppose I wish to calculate the expected value of $U(0,1)$
I could take 1000 random numbers from $U(0,1)$ and take the average of them
The standard deviation of my error would be $\frac{1}{\sqrt{12} \times \sqrt{1000}} = 0.009129$
The problem I have here is that I am taking numbers randomly from all over the line.
Suppose I wanted to do something more akin to numerical integration where we would go step by step across the interval adding up each number
Take a random number from $U(0,0.1)$ and then synthesise a number in each of $U(0.1, 0.2)$, $U(0.2, 0.3)$, etc. by adding $0.1$, then $0.2$, etc. to the original number
If we repeat this 100 times so that we have a total of 1000 random numbers (100 random + 900 generated) then
The variance of the first number is $\frac{1}{12} \times 0.1^2 = \frac{1}{1200}$
The variance of the average of the first 10 numbers is then the same, $\frac{1}{12} \times 0.1^2 = \frac{1}{1200}$, since those 10 numbers are just the one random draw shifted by constants
The variance of the average of all 1,000 numbers (the average of 100 independent such groups) is then $\frac{1}{100} \times \frac{1}{12} \times 0.1^2 = \frac{1}{120000}$
and the standard error is $\sqrt{\frac{1}{120000}}=0.002887$
So the standard error is reduced by a factor of $\sqrt{10}$
This spreadsheet performs the experimental confirmation of this
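A minimal sketch of the construction described above (100 repetitions, each consisting of one genuine draw from $U(0,0.1)$ plus nine shifted copies); its standard error should be close to the $0.002887$ derived above:

Sub StratifiedMean()
    ' Stratified estimate of E(U(0,1)): draw one number from U(0, 0.1)
    ' per repetition and shift it by 0.1, 0.2, ..., 0.9 to cover the
    ' remaining strata
    Dim rep As Long, j As Long
    Dim base As Double, total As Double
    Const nReps As Long = 100
    Randomize
    For rep = 1 To nReps
        base = 0.1 * Rnd                  ' one random draw from U(0, 0.1)
        For j = 0 To 9
            total = total + base + 0.1 * j
        Next j
    Next rep
    MsgBox "Stratified estimate of the mean: " & total / (10 * nReps)
End Sub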
Sometimes you may want to 'crawl' along a Brownian motion to check for features such as knockout options.
However this will slow down your calculation.
The Brownian bridge is a way of jumping straight to the end of the path and then going back and filling in interpolated values. You can then make this fill-in conditional on whether the final value suggests that a knockout barrier may have been hit.
Let $W_t$ be a standard Wiener process so $W_0=0$ and $W_t \sim N(0,t)$ and specifically $W_1 \sim N(0,1)$
Now suppose we generate one number for $W_1$ from the distribution $N(0,1)$ and this is $b$
Then we want to simulate a value for $X_{\frac{1}{2}}$, where $X$ is our Brownian motion conditioned on the value $X_1=b$. We could derive this directly, but the easiest method is to propose the 'answer' and then show that it works
So propose that $X_{\frac{1}{2}} | _{X_0=0, X_1=b} \sim \frac{b}{2} + Y$ where $Y \sim N(0,\frac{1}{4})$
We note here that the variance of $Y$ is $\frac{1}{4}$ NOT $\frac{1}{2}$ as we might instinctively expect for a $t=\frac{1}{2}$ Brownian motion. This is because this Brownian motion is constrained at both ends
Now let's remove $X_1=b$ from the filtration and consider just:
$X_{\frac{1}{2}} |_{ X_0=0} = \frac{1}{2}W_1 + Y \sim N(0,\frac{1}{4} + \frac{1}{4}) = N(0, \frac{1}{2})$
So $X_{\frac{1}{2}} |_{X_0=0}$ has the same distribution as $W_{\frac{1}{2}}$
Hence we can say our value so constructed is indistinguishable from a directly calculated Wiener process value
The 'propose a solution and show it is right' method may seem a bit low-brow, but it is perfectly valid and rigorous.
However there is one aspect of this problem we have missed:
We have not shown that the step from $X_{\frac{1}{2}}$ to $X_1$ is also a Wiener increment. This is easy to show, and indeed it is.
The same method works for other intervals
For the same problem of $W_1=b$ we can find any $W_t$ for $0 < t < 1$ by simulating $X_t = tb + Y$ where $Y \sim N(0,t(1-t))$
You can show this as an extension exercise
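As a final sketch, the one-step bridge could be coded as follows, reusing the RandNormal (Box-Muller) function sketched earlier: simulate $W_1=b$ directly and then fill in any interior time $t$ as $tb + \sqrt{t(1-t)} \times Z$ with $Z \sim N(0,1)$:

Sub BrownianBridgePoint()
    ' Simulate W_1 = b directly, then fill in an interior value
    ' conditional on W_0 = 0 and W_1 = b:  X_t = t*b + sqrt(t*(1-t))*Z
    Dim b As Double, t As Double, xt As Double
    Randomize
    b = RandNormal()                       ' W_1 ~ N(0,1)
    t = 0.5
    xt = t * b + Sqr(t * (1 - t)) * RandNormal()
    MsgBox "W_1 = " & b & ",  bridged value at t = 0.5: " & xt
End Sub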